1. Introduction


Combat  requires  systems  that  respond  rapidly,  efficiently,  and  safely  while 
attaining  mission  objectives  in  situations  that  are  increasingly  complex  (Barnes 
et al. 2014). “Smart” technologies are becoming ubiquitous in the modern world,
changing the relationship of humans to their machines (Economist 2016). In
particular,  autonomous  systems  are  being  developed  for  a  wide  variety  of  civilian 
and  military  applications  to  improve  safety  and  reduce  manpower  restrictions 
(Purdy 2008; Greenemeier 2010; Murphy and Burke 2010; Osborn 2011; O’Dell
2013;  Atherton  2015;  Pellerin  2015;  Brewster  2016).  Before  proceeding  with  the 
discussion  of  autonomous  systems,  it  is  important  to  note  that  using  the  term 
“autonomy”  in  military  environments  is  possibly  misleading.  The  dictionary 
definition  of  autonomy  is  “not  subject  to  control  from  outside;  independent, 
existing  and  functioning  as  an  independent  organism”  (Dictionary.com  2016). 
Because  of  the  uncertainties  of  the  battlefield  and  the  importance  of  human  life,  all 
combat systems are subject to ultimate human control (Chen and Barnes 2014;
Endsley 2015b). Therefore, we define autonomous software agents in terms that are
similar to the role that humans play in a military environment. The software agent
is an intelligent nonhuman agent (IA) that has clear objectives, is able to monitor its
environment, and is autonomous in the sense that it can generate courses of action
(COAs) to attain its objectives (Russell and Norvig 2009). However, as discussed
in the following, the IA is always subordinate to its human supervisor, much as a
Soldier is subordinate to a commander.

Traditionally,  humans  and  automated  systems  have  been  assigned  to  separate 
functions, but recent advances permit a fluid relationship approximating human
teaming  paradigms  (Lyons  and  Stokes  2012;  Cummings  2014).  Various 
technologies  are  being  designed  to  interact  and  communicate  with  human  operators 
to  ensure  decision-making  that  is  shared,  flexible,  and  still  human-centric  (Fisher 
et al. 2007; Goodrich 2010; Goodrich et al. 2013; Chen and Barnes 2014). The
differences  in  the  following  techniques  involve  the  relative  roles  of  humans  and 
agents. 

Adaptive  systems  monitor  the  environment  or  the  operator’s  cognitive  state  for 
triggering events. During overload or emergencies, control of system functions is
automated. Conversely, the IA software relinquishes control to the operator during
normal  operations.  Human  control  is  ensured  by  an  operator-initiated  contract  prior 
to  the  mission  that  defines  decision  precedence  between  the  IA  and  the  operator 
(Parasuraman  et  al.  2007;  de  Visser  et  al.  2011). 
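
The adaptive scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the cited studies: the `Contract` fields, the 0-to-1 workload scale, the threshold value, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical pre-mission contract initiated by the operator."""
    workload_threshold: float  # workload above this value counts as overload
    automatable: frozenset     # functions the operator agreed to hand off


def controller(function: str, workload: float, emergency: bool,
               contract: Contract) -> str:
    """Return who controls `function` right now: 'IA' or 'operator'."""
    if function not in contract.automatable:
        return "operator"      # not covered by the contract: never automated
    if emergency or workload > contract.workload_threshold:
        return "IA"            # triggering event: the function is automated
    return "operator"          # normal operations: the IA relinquishes control


contract = Contract(workload_threshold=0.8,
                    automatable=frozenset({"route_following"}))
print(controller("route_following", 0.9, False, contract))   # IA
print(controller("route_following", 0.3, False, contract))   # operator
print(controller("target_selection", 0.9, True, contract))   # operator
```

The point of the sketch is that the contract, fixed before the mission, bounds what the triggering logic is allowed to automate.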

In  contrast,  adjustable  autonomy  (also  referred  to  as  adaptable  autonomy)  is 
automation that is instantiated at the discretion of the human during the mission. It
may take the form of “plays”, which are predetermined software solutions that the
operator  “calls”  during  the  mission  to  address  immediate  tactical  concerns  (Miller 
and  Parasuraman  2007). 

In  this  report  we  will  discuss  mixed-initiative  systems,  in  which  the  decision  space 
is  shared  between  IAs  and  human  operators  in  real  time.  Mixed-initiative  systems 
can incorporate both adaptive and adjustable software as part of the joint decision-making
process (Hardin and Goodrich 2009; Goodrich 2010; Goodrich et al. 2013;
Chen and Barnes 2014; Barnes et al. 2015).

To summarize, we are discussing mixed-initiative autonomy wherein humans and
IAs share decision-making. The IA has a degree of autonomy and communicates
with its human supervisor. The focus of this report is on the decision-making
relationship between IAs and their human supervisor(s) in future battle spaces that
are dynamic, dangerous, and complex (Stone 2012). We will discuss human-agent
teams in military environments as they relate to multiple systems, trust,
transparency, agent architectures, 2-way communications, and types of
interfaces, showing current progress and indicating possible future research
opportunities (Stone 2012; Cummings 2014; Chen and Barnes 2014; Barnes et al.
2014; Endsley 2015a).

1.1  Military  Constraints 

The  use  of  autonomy  by  the  military  raises  special  issues  regarding  rules  of 
engagement  (rules  stipulating  the  circumstances  under  which  use  of  weapons 
systems  is  permitted)  beyond  those  involved  in  civilian  applications  (Defense 
Science  Board  2012;  Jentsch  and  Fincannon  2012).  Many  of  these  issues  center  on 
the  trade-off  between  the  utility  of  autonomy  and  its  lethality  (Singer  2010).  As  the 
development  of  technology  affords  new  capabilities,  there  is  ongoing  concern  that 
new  autonomous  capabilities  may  improve  combat  effectiveness  at  the  risk  of 
fratricide and civilian casualties (Barnes et al. 2014; Tiron 2003). The “fog of war”
makes  accidents  inevitable,  but  at  least  initially  the  public  will  be  far  less  forgiving 
if  computer  errors  cause  fatalities  than  if  humans  make  the  same  mistakes.  Mica 
Endsley (2015b), in her role as Chief Scientist of the US Air Force, stresses that
Department  of  Defense  (DOD)  Directive  3000.09  (2012)  mandates  safeguards  for 
autonomous  weapons,  as  shown  in  the  following: 

•  “Semi-autonomous  weapon  systems  that  are  onboard  or  integrated  with 
unmanned  platforms  must  be  designed  such  that,  in  the  event  of  degraded 
or  lost  communications,  the  system  does  not  autonomously  select  and 
engage  individual  targets  or  specific  target  groups  that  have  not  been 
previously selected by an authorized human operator.”

• “The system design . . . addresses and minimizes the probability or
consequences  of  failures  that  could  lead  to  unintended  engagements  or  to 
loss  of  control  of  the  system.” 

•  “In  order  for  operators  to  make  informed  and  appropriate  decisions  in 
engaging  targets,  the  interface  between  people  and  machines  for 
autonomous  and  semi-autonomous  weapon  systems  shall:  (a)  be  readily 
understandable  to  trained  operators,  (b)  provide  traceable  feedback  on 
system  status,  and  (c)  provide  clear  procedures  for  trained  operators  to 
activate  and  deactivate  system  functions.” 

These rules reinforce the doctrine that military decisions are human responsibilities and that
operators must have a clear-cut understanding of the consequences of supervising
autonomous systems. Furthermore, Endsley (2015b) concludes that autonomy must
be  integrated  into  the  force  structure.  This  entails  training  and  the  development  of 
interfaces  that  make  the  autonomy  understandable  to  ensure  human-centered 
systems.  Similarly,  the  Defense  Science  Board  (DSB)  (2012)  emphasizes  the 
importance  of  designing  autonomy  that  fits  into  the  network  of  military  capabilities 
so  as  not  to  introduce  “brittle”  systems  that  have  a  negative  impact  on  overall 
mission  effectiveness.  The  DSB  also  stresses  the  importance  of  enablers  of 
autonomy  such  as  naturalistic  interfaces  to  improve  collaboration,  trust,  and 
situation  awareness  (SA)  while  reducing  the  Soldier’s  physical  and  cognitive 
workload. 

1.2  Mixed-Initiative  Systems  and  a  General  Framework 

Human-agent  teaming  is  an  important  concept  because  it  implies  a  personal 
relationship  between  the  agent  and  the  human.  Ideally,  the  relationship  will  require 
bi-directional communications and a common worldview. IA architectures and their
ability to communicate with humans are still primitive, but they are progressing beyond
the stage of literal translations and moving toward interpretation in terms of intent
(Jurafsky and Martin 2009; Lomas et al. 2012; Wang et al. 2016). Figure 1 is a
nominal  human-agent  framework  intended  to  provide  an  overview  of  the  issues  that 
are discussed in the rest of the report.

Fig. 1 Characteristics of human-agent shared decision-making

This  framework  will  be  used  to  motivate  discussions  of  the  various  features 
influencing  human-agent  shared  and  separate  decision  spaces.  The  advantage  of  a 
human-agent  partnership  is  that  each  element  has  its  own  strengths  and  weaknesses, 
and  together  they  have  the  potential  of  being  more  effective  than  the  sum  of  their 
parts (Chen and Barnes 2014). For example, the human will have greater meta-knowledge
of political implications and changing strategic objectives, whereas the
agent  may  have  precise  algorithms  for  specific  technical  challenges.  On  the  other 
hand,  there  are  problems  in  combining  the  elements  into  a  cohesive  decision 
structure.  There  are  situations  when  human  trust  is  misaligned  with  the  agent’s 
reliability,  causing  humans  to  either  over-  or  under-trust  the  agent’s  decisions 
(Parasuraman  and  Riley  1997;  Dzindolet  et  al.  2003;  Lee  and  See  2004;  Beck  et  al. 
2007; Mercado et al. 2016). In similar fashion, the agent’s world model might be
misaligned with the operator’s mental model, and the agent could misinterpret the intent of
the operator’s commands (Chen and Barnes 2014).

The  user  interface  needs  to  be  transparent  so  that  agents  and  humans  understand 
each  other’s  reasoning  and  uncertainties  while  making  joint  decisions  (Lyons  and 
Havig  2014).  Creating  mutual  understanding  requires  calibrating  the  trust  of  the 
human  operator  and  providing  the  IA  with  an  ability  to  infer  human  intent  (Mercado 
et  al.  2016).  There  are  distinct  features  of  humans  such  as  affect  as  well  as  cultural 
norms  that  must  be  accounted  for  in  the  agent’s  behavioral  repertoire  (e.g.,  rules  of 
etiquette)  (Parasuraman  and  Miller  2004).  To  ensure  safe  operations,  it  is  important 
to have protocols that address emergencies. For example, adaptive mechanisms
would permit the IA to react to dangerous situations (e.g., by initiating collision avoidance)
without waiting for operator permission. Likewise, humans can take back authority
in emergency situations (e.g., to prevent fratricide) (Chen et al. 2010). It is crucial to
define control procedures that are flexible but consonant with the operator’s intent,
much as Soldiers are encouraged to show initiative as long as they understand the intent of
their original orders.

The  remainder  of  this  report  discusses  current  research  sponsored  by  the  US  Army 
Research  Laboratory  (ARL)  (predominantly  conducted  in  military  environments) 
that  addresses  the  following  5  issues:  1)  control  of  multiple  systems  using 
intelligent  agents,  2)  requirements  for  developing  transparency  and  appropriate 
trust  necessary  for  human-agent  interactions,  3)  agent  architectures  and  their 
implications  for  human-agent  communication,  4)  2-way  human-agent 
communications,  and  5)  naturalistic  interfaces  and  their  importance  for  efficient 
shared  decision-making.  In  addition,  we  discuss  areas  of  future  research  and  the 
shortcomings of our current understanding of human-agent shared decision-making.

1.3  The  Conundrum  of  Control 

As  mentioned  previously,  autonomy  implies  that  the  agent  controls  its  own  actions, 
but  in  a  seeming  contradiction  we  argue  that  ultimate  control  resides  with  the 
human operator. In the mixed-initiative paradigm, it is necessary to develop
protocols  that  dictate  when  the  human,  the  agent,  or  both  (collaborative)  have 
decision  precedence.  The  protocols  can  be  mission-specific  or  be  general  in  nature. 
However,  in  rapidly  changing  environments,  human  concurrence  with  agent 
decisions may not be practical. This is particularly true for multi-system control
where the number of elements and the difficulties of controlling each element
make effective supervision difficult, if not impossible (Miller 1956; Lewis and
Wang 2010; Schulte and Meitinger 2010; Chen et al. 2011; Lewis 2013). Metrics
such  as  neglect  time  (time  estimates  of  when  supervisory  attention  is  not  needed 
for  specific  agents)  and  interaction  time  (average  time  that  an  operator  needs  to 
interact  with  an  agent)  are  only  useful  if  scheduling  of  attention  by  n-supervisors 
monitoring  n-elements  is  predictable  (Goodrich  2010;  Goodrich  et  al.  2013). 
Combat  by  its  nature  is  volatile  and  uncertain,  making  scheduling  impractical  in 
many  situations. 
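
To illustrate why these metrics depend on predictable scheduling, consider the commonly used fan-out estimate. It is not stated in this report and is introduced here only as an illustrative assumption: if an agent can be neglected for NT seconds after an interaction lasting IT seconds, one operator can service roughly NT/IT + 1 such agents. The function name and the sample values below are hypothetical.

```python
def fan_out(neglect_time_s: float, interaction_time_s: float) -> float:
    """Rough number of agents one operator can service.

    Assumes identical agents and perfectly predictable scheduling:
    while one agent is being serviced (interaction time), the others
    run unattended (neglect time).
    """
    if interaction_time_s <= 0:
        raise ValueError("interaction time must be positive")
    return neglect_time_s / interaction_time_s + 1.0


# Calm conditions: agents can be left alone for 60 s, each needs 20 s of attention.
print(fan_out(60.0, 20.0))  # 4.0

# Volatile conditions: neglect time collapses to 10 s, so capacity collapses too.
print(fan_out(10.0, 20.0))  # 1.5
```

When neglect time cannot be estimated in advance, as in volatile combat conditions, the estimate loses its predictive value.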

A variety of strategies have proven effective in enhancing mixed-initiative
decision-making.  Some  cognitive  tasks  are  more  amenable  to  automation  than 
others.  For  example,  information  filtering/selection  appears  to  be  a  good  candidate 
for automation algorithms, but selecting an action with important consequences
usually requires human oversight (Parasuraman et al. 2000). There are also cases
where autonomy can be assigned to functions that are housekeeping tasks
(e.g., convoy separation) or that are time-constrained (e.g., collision avoidance) (Wright
et al. 2014). In other time-constrained situations, decision authority can be divided.
For an incoming missile, an operator has the ability to override autonomy until a
critical time limit is reached, after which missile defense systems engage
automatically (Parasuraman et al. 2007). In still other situations, such as identifying
the importance of specific objects, the IA may detect an object but defer to the
human to assess its significance (Jentsch and Fincannon 2012; Barnes et al. 2014).
Autonomy  needs  to  be  flexible;  authority  resides  with  the  operator  but 
circumstances  may  require  the  IA  to  take  the  initiative. 
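
The divided-authority case for an incoming missile can be sketched as a simple time-gated rule. This is a minimal illustration, not a description of any fielded system: the function names, the time values, and the assumption that the defense proceeds before the limit unless the operator vetoes it are all hypothetical.

```python
def decision_precedence(time_to_impact_s: float, critical_limit_s: float) -> str:
    """Who has decision precedence at this moment: 'operator' or 'IA'."""
    # While more time remains than the critical limit, the operator can still override.
    return "operator" if time_to_impact_s > critical_limit_s else "IA"


def defense_engages(time_to_impact_s: float, critical_limit_s: float,
                    operator_veto: bool) -> bool:
    """Return True if the automated defense acts at this moment."""
    if decision_precedence(time_to_impact_s, critical_limit_s) == "operator":
        # Assumption: before the limit, the automation proceeds unless vetoed.
        return not operator_veto
    # At or past the critical limit, the defense engages automatically.
    return True


print(defense_engages(12.0, 5.0, operator_veto=True))   # False: override honored
print(defense_engages(4.0, 5.0, operator_veto=True))    # True: past the limit
```

The design choice worth noting is that the limit is fixed in advance, so the operator knows exactly when the override window closes.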


The  DSB  (2012)  suggested  that  autonomy  will  be  particularly  useful  when  an 
operator  must  interact  with  multiple  assets.  A  caveat  is  that  the  number  of  assets 
controlled  must  be  limited  to  that  which  can  be  managed  effectively  by  a  single 
human  (Lewis  and  Wang  2010).  To  minimize  control  issues,  ARL  researchers  are 
investigating  an  IA  (Section  2)  that  acts  as  an  intermediate  supervisor  by 
monitoring  subordinate  systems  and  by  suggesting  COA  changes  when  unexpected 
events occur during the mission (Chen et al. 2011; Chen and Barnes 2012a, b; Chen
and Barnes 2014). However, as Fig. 2 illustrates, hierarchical networks of agents
can be expanded to enlist multiple local agents that interact with supervisory agents
that in turn interact with the human operator. These paradigms use network
technology with the human operator at the apex to reduce the problem space to
manageable proportions without abrogating human decision authority (Hou et al.
2011; Chen and Barnes 2014). Multi-agent paradigms also have the advantage of
being  able  to  reconfigure  the  network  as  either  the  mission  changes  or  an  agent 
becomes  disabled  (DSB  2012). 


[Figure 2 panel labels: Multiple robots; Single operator, multiple supervisor agents, multiple robots]

Fig. 2 Control structures for human-agent teams. Robots without tools are supervisor robots,
while robots with tools at their base are operational robots.

2. RoboLeader and Human-Agent Control Processes


RoboLeader  is  an  ARL  research  paradigm  investigating  the  human  performance 
implications  of  instantiating  an  IA  that  supervises  multiple  systems  and  in  turn  is 
supervised  by  a  human  operator  (Chen  et  al.  2010).  RoboLeader  researchers 
simulated  various  missions  such  as  surveillance,  policing,  and  convoy  operations. 
Over  the  course  of  numerous  studies,  IA  error  type  and  rate,  task  difficulty,  degree 
of  autonomy,  individual  differences,  agent  transparency,  and  type  of  multitasking 
were investigated (Barnes et al. 2011; Chen and Barnes 2012a, b; Wright et al. 2014;
Chen et al. 2016). The most recent studies simulated operators engaged in convoy
operations in which they supervised an unmanned ground vehicle (UGV), an
unmanned aerial vehicle (UAV), and a manned ground vehicle while conducting
360° threat monitoring around their own vehicle (Wright et al. 2013, 2016 in press).
The IA made convoy route change suggestions when unanticipated events occurred
during the mission (Fig. 3). The results showed that the type of reasoning
information mattered: succinct text explanations helped the operator reduce misuse of
automation, whereas supplying superfluous information hurt performance (Wright
et al. 2016 in press). Multiple studies using the RoboLeader paradigm resulted in a
better understanding of IAs’ contributions to military decision-making during
manned/unmanned  operations.  Individual  differences  in  gaming  experience,  spatial 
abilities,  and  an  individual’s  confidence  in  attentional  control  proved  to  be 
ubiquitous  factors  in  human-agent  interactions,  implying  that  training  and  decision 
support  should  be  geared  to  individual  aptitudes  and  experience  rather  than  “one 
size fits all” solutions (Chen and Barnes 2011, 2014).

[Figure 3 panel labels: Map and Route Overview; RoboLeader Communications Window; UAV Camera Feed; MGV Forward Camera Feed; UGV Camera Feed; Camera Feed]

Fig. 3 RoboLeader interface for convoy operations (Wright et al. 2014)


3.  Trust  and  Transparency:  Situation-Based  Agent 
Transparency  Model 

Trust  is  an  important  research  topic  for  both  automated  and  autonomous  systems 
because  it  mediates  between  the  reliability  of  such  systems  and  operators’  decisions 
to  use  them  (Wickens  1994;  Lyons  and  Havig  2014).  Lee  and  See  (2004)  defined 
appropriate  trust  as  human  reliance  on  automation  that  minimizes  disuse  (failure  to 
rely  on  reliable  automation)  and  misuse  (over-relying  on  unreliable  automation) 
(Parasuraman and Riley 1997; Parasuraman and Manzey 2010; de Visser et al.
2012). Trust can be measured either as an attitude (subjective measure) or as a
behavior  (misuse  and  disuse)  (Lyons  and  Stokes  2012;  Meyer  and  Lee  2013). 
Furthermore,  trust  can  be  a  predisposition  of  the  operator  (trait)  or  depend  on 
specific  circumstances  (state)  (Schaeffer  and  Scribner  2015;  Schaeffer  et  al.  2015). 
Subjective  ratings  have  been  shown  to  correlate  with  automation  reliability, 
perception  of  the  IA  capabilities,  and  task  difficulties  as  well  as  individual 
differences  (Hancock  et  al.  2011;  Schaeffer  and  Scribner  2015;  Schaeffer  et  al. 
2015).  Lee  (2012)  suggests  that  in  order  to  make  the  underpinnings  of  the 
automation  algorithms  transparent,  the  operator  must  be  able  to  understand  their 
purpose,  process,  and  performance.  Based  on  these  and  related  concepts,  ARL 
researchers (Chen et al. 2014; Chen and Barnes 2015) developed the SA-based
Agent Transparency (SAT) model (Fig. 4) to elucidate aspects of SA affecting trust
(Endsley 1995, 2015b). SAT posits 3 transparency levels (Ls) of information to
support the operator’s understanding of the IA’s decision process: L1) operator
perception of the IA’s actions and plans, L2) comprehension of the IA’s reasoning
process, and L3) understanding of the IA’s predicted outcomes. The purpose of the
SAT model is to define the type of information necessary to give the operator
insight into the IA’s intent, logic, and the perceived likelihood of attaining its end
state.


• To support operator’s Level 1 SA (What’s going on and what
  is the agent trying to achieve?)
    • Purpose
        • Desire (Goal selection)
    • Process
        • Intentions (Planning/Execution)
        • Progress
    • Performance

• To support operator’s Level 2 SA (Why does the agent do it?)
    • Reasoning process (Belief) (Purpose)
    • Environmental & other constraints

• To support operator’s Level 3 SA (What should the operator
  expect to happen?)
    • Projection to Future/End State
    • Potential limitations
        • Uncertainty; Likelihood of error
        • History of performance

Fig.  4  Features  of  the  SAT  model  of  agent  transparency  (Chen  et  al.  2014) 
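
One way to picture how the 3 SAT levels map onto what an interface displays is a small data structure. This is a sketch under stated assumptions rather than the implementation used in the ARL studies: the `SatMessage` class, its field names, and the example strings are hypothetical. The condition labels (L1, L1+2, L1+2+3) are the cumulative transparency conditions used in the experiments discussed in Section 3.2.

```python
from dataclasses import dataclass


@dataclass
class SatMessage:
    """Hypothetical container for one agent recommendation."""
    plan: str        # L1: what the agent is doing or intends to do
    reasoning: str   # L2: why the agent chose this plan
    projection: str  # L3: expected outcome and its limitations

# Each display condition is cumulative: higher levels add to lower ones.
CONDITIONS = {
    "L1": ("plan",),
    "L1+2": ("plan", "reasoning"),
    "L1+2+3": ("plan", "reasoning", "projection"),
}


def render(message: SatMessage, condition: str) -> list:
    """Return the text elements shown to the operator for a condition."""
    return [getattr(message, field) for field in CONDITIONS[condition]]


msg = SatMessage(
    plan="Send UAV-2 along the coastal route",
    reasoning="Shortest time to target with adequate sensor coverage",
    projection="Arrival in about 6 min; fuel margin is low",
)
for line in render(msg, "L1+2"):
    print(line)
```

Keeping the levels as separate fields makes it straightforward to vary the transparency condition without changing the agent's underlying recommendation.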


3.1  Autonomy  Research  Pilot  Initiative  and  Agent  Transparency 
Research 

The US DOD-funded Autonomy Research Pilot Initiative (ARPI) program
investigates the effects of autonomy in military environments and develops
implementation practices based on acquired knowledge. The research is far-ranging
and involves multiple programs. Two programs in which ARL is involved,
Intelligent Multi-UxV Planner with Adaptive Collaborative/Control Technologies
(IMPACT) and Autonomous Squad Member (ASM), are discussed as exemplars
of mixed-initiative systems. Because of their realistic nature, time constraints, and
complexity, the ARPI projects are ideal platforms for evaluating operator trust and,
in particular, the efficacy of SAT model constructs. The IMPACT project is a tri-service
program that investigates shared decision-making between multiple
intelligent  systems  and  the  human  operator  for  a  base  defense  scenario  in  a  littoral 
environment  (Draper  2014).  The  operator’s  role  is  to  plan  and  supervise  a  mission 
with  aerial,  ground,  and  naval  unmanned  vehicles  (UVs)  that  respond  to  suspicious 
activities  related  to  base  security  (e.g.,  encroachments  on  the  shore  line).  In  typical 
mission scenarios, the base defense coordinator suggests an initial plan objective
referred to as a “play” (Miller and Parasuraman 2007). Then the IA/planner chooses
available assets (UVs with sensor options) and generates the optimal route for the
chosen platforms to attain the objective. During the mission, machine learning
(ML) algorithms monitor progress. However, operators have final executive
authority; they are able to tweak the plan or choose an option other than the agent’s
preferred option. The objective of the ASM project is to investigate capabilities of
a robotic asset to support squad-level performance during infantry missions (Chen
and Barnes 2015; Selkowitz et al. 2016 in press). The ASM robot responds
autonomously by adapting to squad tactics. Here again, Soldiers are the final
arbiters of the ASM decisions but are constrained by the fact that the ASM must
respond rapidly to perturbations during the mission.

3.2  Transparency  and  Trust:  ARPI  Results 

Mercado  et  al.  (2016)  tested  implications  of  the  SAT  model  simulating  a  simplified 
version  of  the  IMPACT  interface  and  missions.  The  experiment  consisted  of  24 
simulated  IMPACT  scenarios  (counterbalanced  among  participants)  broken  down 
into 8 scenarios for each SAT condition (L1, L1+2, and L1+2+3). Two COAs, A
and B, were displayed for each scenario, with option A being the IA’s preferred
COA. During the experiment, participants were given alerts that supported option
A in 63% of the scenarios and option B in 37%. Performance improved as a function of
increasing SAT levels; misuse and disuse both decreased for the 2 higher
transparency level conditions. Furthermore, subjective measures of trust increased
for SAT Level 1+2+3, showing that attitude as well as performance was positively
affected when the operator was given projected outcome information (Meyer and
Lee 2013). Unlike previous research, neither workload nor response latency
measures were degraded by the additional information comprising higher SAT
levels (Helldin 2014). Of particular note was the finding that portraying uncertainty
information in the L3 conditions improved the participants’ subjective trust
(Mercado  et  al.  2016). 

In a follow-on study, uncertainty (U) was parsed from L3 and the experimental
conditions included L1+2 vs. L1+2+3 vs. L1+2+3+U (Stowers et al. 2016). Overall,
uncertainty information improved operator performance. Although the
improvement in percentage correct was most evident in correct rejections (rejecting
the suggested COA when the alternative was correct), Fig. 5 indicates that proper
use (choosing the IA suggestion when it was correct) followed the same trend.
Unlike the first experiment, subjective trust scales did
not show significant improvements when uncertainty information was displayed to
the operator (Chen et al. 2016; Stowers et al. 2016).

Fig. 5 Percentage correct data for levels of the SAT model in the second experiment (Stowers
et al. 2016)
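
The behavioral reliance measures used in these experiments (proper use, correct rejection, misuse, and disuse) can be computed from per-scenario records as sketched below. This is an illustrative assumption about how such scoring could be done, not the scoring code from the cited studies. The trial data are invented for the example, and misuse and disuse are operationalized per scenario as accepting an incorrect suggestion and rejecting a correct one, respectively.

```python
from collections import Counter


def classify(chose_ia_option: bool, ia_option_correct: bool) -> str:
    """Label one scenario by the operator's reliance on the IA's suggestion."""
    if ia_option_correct:
        return "proper_use" if chose_ia_option else "disuse"
    return "misuse" if chose_ia_option else "correct_rejection"


def reliance_rates(trials):
    """trials: iterable of (chose_ia_option, ia_option_correct) pairs."""
    counts = Counter(classify(chose, correct) for chose, correct in trials)
    n_ia_right = counts["proper_use"] + counts["disuse"]
    n_ia_wrong = counts["misuse"] + counts["correct_rejection"]
    total = n_ia_right + n_ia_wrong
    return {
        # Rates are conditional on whether the IA's suggestion was correct.
        "proper_use": counts["proper_use"] / n_ia_right if n_ia_right else None,
        "disuse": counts["disuse"] / n_ia_right if n_ia_right else None,
        "correct_rejection": counts["correct_rejection"] / n_ia_wrong if n_ia_wrong else None,
        "misuse": counts["misuse"] / n_ia_wrong if n_ia_wrong else None,
        "percent_correct": (counts["proper_use"] + counts["correct_rejection"]) / total
        if total else None,
    }


# Invented data: 8 scenarios, IA suggestion correct in 5 and incorrect in 3.
trials = [(True, True)] * 4 + [(False, True)] + [(True, False)] + [(False, False)] * 2
rates = reliance_rates(trials)
print(rates["proper_use"], rates["percent_correct"])  # 0.8 0.75
```

Reporting proper use and correct rejection separately, as in Fig. 5, shows whether a gain in percentage correct comes from accepting good suggestions or from rejecting bad ones.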

For a different task environment involving autonomous squad members (small
robots), the results showed that adding uncertainty information (L1+2+3) did not
improve subjective trust, whereas reasoning cues (L1+2) improved trust over
baseline conditions (L1) (Chen and Barnes 2015). A follow-on experiment showed
improvements in SA and trust when prediction information was available but
showed no advantage to adding uncertainty information (Selkowitz et al. 2016 in
press). The experiments verified the main tenets of the SAT model. That is, the
operator was better able to adjust the agent’s plan based on environmental or
tactical changes because of the insights afforded by transparency information
(Mercado et al. 2016). The efficacy of adding uncertainty to prediction was unclear,
especially in the ASM environment (Selkowitz et al. 2016 in press). This is at
variance  with  other  research  that  found  that  portraying  uncertainty  improved  proper 
reliance  on  automated  systems  in  a  variety  of  environments.  These  findings  indicate 
that  more  research  is  necessary  to  determine  in  which  environments  and  for  which 
display  formats  displaying  uncertainty  is  beneficial  (Bass  et  al.  2013;  Bisantz  2013; 
Chen  et  al.  2014;  Helldin  2014). 

4.  Team  Communications 


Besides  understanding  the  IA’s  decision  processes,  humans  will  need  to  interact 
with  the  agent  as  a  team  member  to  achieve  effective  shared  decision-making 
(Green  et  al.  2008).  Teams  are  defined  as  2  or  more  entities  that  collaborate  (share 
decision-making)  and  coordinate  (synchronize  tasks)  to  accomplish  common  goals. 
Especially  in  dynamic  environments,  an  effective  team  requires  compatible 
knowledge  structures  and  the  ability  to  communicate  (Morrow  and  Fiore  2013). 
Both characteristics assume transparency among team members to the extent that
decisions and actions among team members are mutually understood. Knowledge
structures  can  be  shared  mental  models  or  specialized  models  by  individual 
members,  but  the  intent  of  individual  team  members  must  be  communicated  to 
other  team  members  for  effective  joint  action.  In  the  case  of  teams  of  IAs  and 
humans,  the  underlying  knowledge  structures  may  be  quite  different  from  those 
typically  found  in  human-human  teams.  For  example,  in  the  IMPACT  architecture, 
the  IA  processing  is  opaque  to  the  operator  but  the  resultant  options  can  be 
graphically  compared  in  terms  of  the  trade-offs  among  the  different  outcomes  (fuel 
consumption,  time  to  target,  etc.)  indicating  the  IA’s  intended  end  state  for  each  of 
the  options  (Behymer  et  al.  2015;  Chen  et  al.  2016a, b). 

Agent  architectures  are  still  a  matter  of  considerable  research  interest  (Chen  and 
Barnes  2014).  We  discuss  the  trade-off  between  processing  efficiency  and 
transparency for shared problem-solving in subsequent sections. Unlike in a human-human
teaming relationship, natural dialogue with agents is still difficult in unanticipated or
unprogrammed situations. However, progress in developing cognitive knowledge
structures  and  natural  language  processing  is  making  human-like  interactions  with 
agents  increasingly  likely  for  the  near  future  (Lomas  et  al.  2012;  Economist  2016). 

4.1  ARL  Robotic  Collaborative  Technology  Alliance  and 
Computational  Cognitive  Models 

Cognitive  models  such  as  SOAR  and  ACT-R  using  rule-based  systems,  neural  nets, 
and  ML  approaches  were  developed  to  model  human  information  processing 
capabilities  (Laird  et  al.  2011).  Recently,  cognitive  models  have  been  used  to 
develop  architectures  to  improve  the  robot’s  capacity  to  navigate  in  real-world 
environments and to solve problems such as finding a doorway or locating a particular
item in a room, or to simulate an IA that acts as a surrogate crewmember (Ball et al.
2010; Kelley and McGhee 2013; Chen and Barnes 2014). There are a number of
advantages to using cognitive approaches as opposed to purely algorithmic
solutions. One is that cognitive models emulate a system that is adaptive and has
proved successful in complex environments (i.e., the human brain), and another is
that the similarities between the model’s knowledge representations and human
cognition should make transfer of information easier between agents and humans.
ARL’s Robotic Collaborative Technology Alliance has used a “find the back door”
scenario to develop both the knowledge structures and the language processing
required to control a small robot using simple commands to find a designated door.

Kelley  and  colleagues  (e.g.,  Kelley  and  McGhee  [2013])  have  developed  the  Sub- 
Symbolic  Robotics  Intelligence  System  cognitive  processing  model  to  control 
robotic autonomous behaviors. Kelley and McGhee used the concept of episodic
memory (memory streams based on past experience) to emulate cognitive processes
and address problems such as finding a back door in a building, which required the
robot to navigate around obstacles. The robotic IA builds 2 types of software
memories: novel and boring. The latter are memories that carry little information
because they do not change over time. The robotic agent combines the 2 types to
remember when a boring memory transitions to a novel event in order to build a
cognitive map to the correct back door solution.
Other  ARL  research  has  used  metaphors  such  as  dreaming  for  knowledge 
consolidation  and  software  constructs  to  represent  emotions  and  temperament  to 
make  the  robotic  agent  more  accessible  to  its  human  teammate  (Kelley  2014;  Long 
et  al.  2015).  However,  there  is  no  reason  to  base  agent  intelligence  solely  on 
cognitive  models;  there  are  numerous  useful  agent  technologies  that  are  based  on 
computational  logic  that  solve  specific  problems  efficiently  (Fisher  et  al.  2007). 
Optimization  algorithms  such  as  Simultaneous  Localization  and  Mapping  have 
been used successfully for Army problems such as using robots to find the location
of objects in buildings that would be unsafe for Soldiers performing the same
function (Barnes et al. 2014). Whatever the IA’s knowledge representation, it must
translate to formats that humans understand to help foster a common language
between  the  IA  and  its  operator. 
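
The novel/boring distinction described above can be illustrated with a minimal sketch. It is not Kelley and McGhee's implementation: the scalar sensor readings, the change threshold, and the function names are assumptions made only for illustration. A reading that barely differs from its predecessor is labeled boring, and the points where a boring memory gives way to a novel one are kept as candidate landmarks for a map.

```python
from typing import List


def label_memories(readings: List[float], threshold: float = 0.5) -> List[str]:
    """Label a reading 'boring' if it barely differs from the previous one,
    otherwise 'novel'. The first reading has no predecessor and is novel."""
    labels = []
    for i, value in enumerate(readings):
        if i > 0 and abs(value - readings[i - 1]) < threshold:
            labels.append("boring")
        else:
            labels.append("novel")
    return labels


def transitions(labels: List[str]) -> List[int]:
    """Return the indices where a boring memory is followed by a novel one."""
    return [
        i for i in range(1, len(labels))
        if labels[i - 1] == "boring" and labels[i] == "novel"
    ]


# Hypothetical range-sensor trace: a uniform wall, an opening, then wall again.
trace = [2.0, 2.0, 2.1, 2.0, 6.5, 6.4, 2.0]
labels = label_memories(trace)
landmarks = transitions(labels)
```

In this toy trace the landmarks fall where the wall opens and where it closes again, which is the kind of boring-to-novel transition the text describes as the basis for the cognitive map.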

4.2  Language  Processing 

Although  compatibility  of  knowledge  structures  is  important  for  communication, 
compatibility  by  itself  is  not  sufficient  for  2-way  communications.  Language  does 
not  require  either  text  or  spoken  dialogue;  it  does  require  syntax,  semantics,  and 
pragmatics to convey meaning during 2-way communications. Pragmatics ensures
that the communication is appropriate for the intended environment. For example,
“go to the bank” has a different interpretation depending on whether the dialogue
refers to natural surroundings or financial transactions (Jurafsky and Martin 2009).
Thus, natural dialogue depends on nuance and intention and not simply on literal
translation (Hoare and Parker 2010). This makes open-ended natural language
processing (NLP) difficult and possibly impractical for some combat environments
(Harris and Barber 2014). Chen and Barnes (2014) identified 3 gradations of
language  processing:  command  processing,  controlled  language  processing,  and 
open-ended  NLP.  The  levels  vary  both  in  the  size  of  their  lexicons  and  the 
underlying  sophistication  of  the  software.  As  opposed  to  open-ended  NLP, 
command  and  controlled  language  processing  are  both  attuned  to  specialized 
tasking environments.

4.2.1 Command Processing

In  practice,  the  command  and  the  controlled  processing  strategies  may  overlap. 
However,  controlled  algorithms  are  geared  to  complex  missions  while  command 
lexicons  have  been  used  successfully  for  limited  task  repertoires  such  as  selecting 
menu options or controlling robotic movements (“move to the north of x”). The
advantages of command processing are its simplicity and the ease with which
operators are able to assimilate its lexicon after limited training (Pettit et al. 2013;
Barber et al. 2014). The lexicon is not limited to verbal utterances; successful
interactions  between  humans  and  small  robots  have  been  demonstrated  using 
gesture  and  tacton  commands  as  well  (Barber  et  al.  2013,  2015).  Harris  and  Barber 
(2014)  investigated  various  commercial  off-the-shelf  language  processors  to 
interpret  speech  for  a  limited  domain  lexicon  for  commanding  small  robots.  The 
type  of  audio  sensor  and  lexicon  constraints  influenced  accuracy  rate,  which  in 
general  was  fairly  high  (70%-80%).  However,  if  the  available  online  lexicon  was 
much  larger  than  the  lexicon  required  to  command  the  robot,  they  found  the 
accuracy rate to be quite low (5%). The most likely reason was the increased
likelihood of confusion between like-sounding utterances as the lexicon size
increased and became more open-ended. To standardize commands to the robot,
recent  efforts  have  focused  on  developing  lexicons  that  represent  Soldier  speech 
patterns  under  realistic  conditions  (Barber  et  al.  2014),  thus  creating  an  easily 
learned  lexicon  of  moderate  dimensions. 
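
A minimal sketch of command processing against a small closed lexicon follows. The verbs, directions, and command pattern are hypothetical and are not taken from the cited studies; the sketch assumes the speech recognizer has already produced text. It shows only why a restricted lexicon is simple to interpret: anything outside it is rejected rather than guessed.

```python
import re
from typing import Optional, Tuple

# Hypothetical closed lexicon; a fielded lexicon would be derived from Soldier speech.
SIMPLE_COMMANDS = {"stop", "scan"}
DIRECTIONS = {"north", "south", "east", "west"}

# Pattern for commands of the form "move to the <direction> of <landmark>".
MOVE_PATTERN = re.compile(r"^move to the (\w+) of (\w+)$")


def parse_command(utterance: str) -> Optional[Tuple[str, ...]]:
    """Map a recognized utterance to a command tuple, or None if out of lexicon."""
    text = utterance.strip().lower()
    match = MOVE_PATTERN.match(text)
    if match and match.group(1) in DIRECTIONS:
        return ("move", match.group(1), match.group(2))
    if text in SIMPLE_COMMANDS:
        return (text,)
    return None
```

For example, `parse_command("Move to the north of x")` yields a move command, while an open-ended phrase such as "go to the bank" is returned as `None` so that the robot can ask for a repeat instead of acting on a misrecognized utterance.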

4.2.2  Controlled  Processing 

More-sophisticated  inference  engines  are  needed  for  processors  that  enable  2-way 
communications  that  go  beyond  simple  commands.  Apple’s  “Siri”  and  other 
commercial  products  indicate  limited  dialogues  are  possible  with  current 
technology.  Also,  because  of  their  commercial  potential,  2-way  communication 
software  will  continue  to  improve,  enabling  human-agent  interaction  to  mimic 
human-to-human  dialogues  in  the  near  future.  True  dialogues  involve  intent 
inferencing  and  back  and  forth  querying  (such  as,  “Is  this  an  object  you  wish  me  to 
investigate?”  “No  it  is  too  oblong — check  to  the  left  about  10  meters.”)  (Duplessis 
and  Deviller  2015).  Early  agent  architectures  such  as  Belief-Desire-Intention 
modeled  the  agent’s  processes  in  terms  of  beliefs  (understanding  of  the 
environment),  desires  (objectives),  and  intentions  (plan  to  achieve  objectives)  to 
capture  an  agent’s  human-like  qualities  (Rao  and  Georgeff  1995;  Chen  and  Barnes 
2014).  Two  recent  ARL-sponsored  projects  demonstrate  progress  toward  more- 
mature  language  inferencing  in  military  environments  (Giammanco  et  al.  2015; 
Wang  et  al.  2016). 
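
The Belief-Desire-Intention terms above can be made concrete with a small sketch. The plan library, belief names, and class below are assumptions chosen for illustration and do not reproduce any particular BDI system: beliefs are updated from percepts, desires are checked in priority order, and the first desire whose precondition is believed true becomes the committed intention.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical plan library: objective -> (belief that must hold, plan steps).
PLANS = {
    "inspect_object": ("object_detected", ["approach object", "capture image", "report"]),
    "patrol": ("route_clear", ["follow route", "report"]),
}


@dataclass
class BDIAgent:
    beliefs: Dict[str, bool] = field(default_factory=dict)  # understanding of the environment
    desires: List[str] = field(default_factory=list)        # objectives, highest priority first
    intention: Optional[List[str]] = None                   # plan the agent has committed to

    def perceive(self, percept: Dict[str, bool]) -> None:
        """Revise beliefs with new observations."""
        self.beliefs.update(percept)

    def deliberate(self) -> Optional[str]:
        """Commit to the highest-priority desire whose precondition is believed."""
        for desire in self.desires:
            precondition, steps = PLANS[desire]
            if self.beliefs.get(precondition, False):
                self.intention = list(steps)
                return desire
        self.intention = None
        return None
```

Because the beliefs, the selected desire, and the resulting plan are explicit data, each can in principle be reported to the operator, which is one reason this style of architecture is often discussed alongside transparency.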

Controlled English (CE) is a specialized natural language representation developed
by International Technology Alliance scientists from ARL and the United Kingdom
(Giammanco et al. 2015; Xue et al. 2015). CE can be written by humans in English
that  is  then  translatable  into  a  machine-readable  format  using  domain  semantics  and 
predicate  logic  inferencing.  Specifically,  the  user’s  conceptual  model  is  written  in 
CE  as  logical  inference  rules  representing  their  relationships  to  specific  military 
and  civilian  environments  (Giammanco  et  al.  2015).  For  example,  CE  algorithms 
for  military  intelligence  applications  learn  from  interacting  with  intelligence 
analysts  during  real-world  military  vignettes.  The  interactions  result  in  the  ability 
of  the  CE  to  make  sophisticated  inferences  concerning  intelligence  processes 
mimicking  a  human  partner  performing  the  same  function  (Mott  et  al.  2015).  The 
crucial  difference  between  CE  and  simpler  command  processes  is  the  ability  of  CE 
algorithms  to  infer  the  meaning  of  environmental  and  situational  cues. 
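
The idea of inference rules applied to domain facts can be sketched with simple forward chaining. This is not CE syntax or the ITA software: the triple representation, the facts, and the rules below are hypothetical, and the rules are ground (no variables) to keep the example short.

```python
from typing import FrozenSet, List, Set, Tuple

Fact = Tuple[str, str, str]          # (subject, relation, object)
Rule = Tuple[FrozenSet[Fact], Fact]  # (premises, conclusion)


def forward_chain(facts: Set[Fact], rules: List[Rule]) -> Set[Fact]:
    """Apply rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical facts an analyst might assert.
facts = {
    ("vehicle V1", "is parked near", "checkpoint C2"),
    ("vehicle V1", "is", "unregistered"),
}

# Hypothetical rules; the second rule depends on the conclusion of the first.
of_interest = ("vehicle V1", "is", "of interest")
rules = [
    (frozenset({of_interest}), ("analyst", "should review", "vehicle V1")),
    (frozenset(facts), of_interest),
]

derived = forward_chain(facts, rules)
```

The loop runs to a fixed point, so the order in which rules are listed does not affect the final set of derived facts.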

Wang  et  al.  (2016)  simulated  a  self-explanatory  agent  that  made  its  intentions  and 
reasoning  transparent  to  the  operator  using  text-based  messages  during  a  simulated 
human-robot  interaction  task.  The  explanations  were  based  on  both  stochastic 
reasoning by the agent and inferences about the human’s preferences. Wang et al.
described the algorithmic process as follows:

Decision-theoretic  planning  provides  an  agent  with  quantitative  utility  calculations 
that  allow  agents  to  assess  trade-offs  between  alternative  decisions  under 
uncertainty.  Recursive  modeling  gives  the  agents  a  theory  of  mind  (Whiten  1991), 
allowing  them  to  form  beliefs  about  the  human  users’  preferences,  factor  those 
preferences  into  the  agent’s  own  decisions,  and  update  its  beliefs  in  response  to 
observations  of  the  user’s  decisions. 

Wang  et  al.’s  agent  not  only  provides  information  about  its  own  reasoning  process, 
but  the  IA  attempts  to  understand  the  preference  structure  of  its  human  teammate. 
Preliminary  results  suggest  that  participants  were  sensitive  not  only  to  the  ability  of 
the  robot  (its  accuracy),  but  also  to  whether  an  agent  generated  an  explanation  for 
its action. Participants performed better (reduced misuse) and more often trusted
even low-ability robots appropriately (reduced disuse) when they understood the
basis of the robot’s decision. Future objectives include generating 2-way dialogues
based  on  the  inferencing  and  language  processing  abilities  of  the  IA  initially  in 
simulation  and  eventually  during  exercises. 
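
The two ingredients named in the quoted description, utility calculations under uncertainty and beliefs about the user's preferences that are updated from observed decisions, can be sketched in a few lines. This is a simplified one-level illustration and not Wang et al.'s algorithm: the preference types, utilities, and the softmax choice model are all assumptions.

```python
import math
from typing import Dict

ACTIONS = ["fast_route", "safe_route"]

# Hypothetical utility of each action under each possible user preference.
UTILITY = {
    "prefers_speed": {"fast_route": 1.0, "safe_route": 0.2},
    "prefers_safety": {"fast_route": 0.0, "safe_route": 0.9},
}


def expected_utility(action: str, belief: Dict[str, float]) -> float:
    """Average the action's utility over the agent's belief about the user."""
    return sum(belief[pref] * UTILITY[pref][action] for pref in belief)


def best_action(belief: Dict[str, float]) -> str:
    return max(ACTIONS, key=lambda action: expected_utility(action, belief))


def choice_likelihood(action: str, pref: str) -> float:
    """Assumed softmax model of how a user with this preference would choose."""
    weights = {a: math.exp(UTILITY[pref][a]) for a in ACTIONS}
    return weights[action] / sum(weights.values())


def update_belief(belief: Dict[str, float], observed_action: str) -> Dict[str, float]:
    """Bayesian update of the belief after observing the user's decision."""
    unnormalized = {
        pref: belief[pref] * choice_likelihood(observed_action, pref) for pref in belief
    }
    total = sum(unnormalized.values())
    return {pref: value / total for pref, value in unnormalized.items()}
```

Starting from a belief that favors a speed-oriented user, each observation of the user choosing the safe route shifts probability toward the safety preference, and the agent's recommended action eventually changes with it. The same quantities (belief and expected utilities) are what a self-explaining agent could report as the basis for its recommendation.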

4.3  Graphic-  and  Video-Mediated  Communications 

In a field environment, communications using chat or voice may not be as efficient
as graphic or video representations. Researchers at Ben-Gurion University
(Oron-Gilad 2014) have collaborated with ARL to investigate the use of video feeds to
individual operators from both UGVs and UAVs surveilling possible insurgent
activities for both stationary (a safe house) and mobile (car) targets. The results of
the studies show distinct advantages to having video feeds from both ground and
aerial views because they give the operator multiple perspectives of the ongoing
mission (Oron-Gilad et al. 2011; Ophir-Arbelle et al. 2013; Barnes et al. 2014).

In  follow-on  research,  Oron-Gilad  and  colleagues  (2014)  simulated  bi-directional 
communication  between  a  UAV  and  ground  forces.  A  human  played  the  role  of  the 
IA  and  human  surrogates  played  the  role  of  the  military  commanders  directing  a 
UAV.  The  purpose  of  the  experimentation  was  to  emulate  graphic  representations 
that  can  be  updated  by  the  UAV  crew  or  its  software  and  annotated  by  the  ground 
commander  (e.g.,  “Go  to  target  X  next”)  during  surveillance  missions.  In  a  recent 
simulation  experiment  (Oron-Gilad  2014),  9  participants  with  Israeli  Defense  Force 
(IDF)  combat  experience  played  the  roles  of  the  ground  commanders  interacting 
with  a  UAV  operator.  The  UAV  operator  emulated  a  bi-directional  agent  capable 
of  communicating  with  the  mission  commander  by  annotating  imagery  during  the 
mission.  Figure  6  shows  2  bi-directional  graphics  generating  updates  from  the 
surrogate  commanders  and  the  UAV  crewmember.  The  graphic  on  the  left 
displayed  important  intelligence  indicators  that  are  annotated  in  real  time,  and  the 
display  refreshed  itself  as  the  mission  progressed  (series  of  static  images).  The 
graphic  on  the  right  showed  images  with  anchor  points  indicating  important  map 
indicators  as  the  mission  progressed. 


Fig.  6  Bi-directional  communications  between  IDF  commanders  and  a  UAV  using  static 
(snapshots)  and  anchor  (permanent  landmarks)  graphics 

The participants found that the combination of transitory snapshots and anchor
points was a viable source of tactical information when bi-directional graphics were
used as an interaction tool. Future research will investigate more complex missions
using UGV as well as UAV videos. In subsequent experiments, enhanced bi-directional
interfaces using audio and tactile cueing will create a richer source of mission
information.  Eventually,  the  research  plans  include  an  IA  with  limited  language 
processing  abilities  to  replace  the  UAV  operator  in  the  bi-directional  experiment. 
In  a  related  effort  (McDermott  et  al.  2015),  the  US  Army  is  developing  advanced 
visualization  and  analytics  tools  to  autonomously  search  for  and  annotate  videos 
with  high-value  intelligence.  The  system  developers  are  in  the  process  of 
integrating  analytics  such  as  face  recognition  tools  and  developing  interfaces  that 
allow  the  operator  to  query  the  system  for  2-way  interactions. 

4.4  Summary  of  Teaming  Requirements 

Human-agent  teams  require  humans  to  have  insight  into  the  IA’s  decision  process 
and vice versa (Chen and Barnes 2015). In addition, the IA must understand the
implications of the human’s intentions, which goes beyond understanding the literal
communications between team members (Hoare and Parker 2010). Bi-directional
communications also require the give and take of normal conversations as each
member of the team queries its teammate concerning the ongoing missions. The
more closely the knowledge structures of the human-agent team are aligned, the
less extensive dialogue is needed (Chen and Barnes 2014).
This  is  important  because  military  operations  have  additional  constraints  to 
minimize  communications  and  to  develop  interfaces  that  are  lightweight,  quiet, 
easy  to  use,  and  non-observable  (Barnes  et  al.  2014). 

5.  Naturalistic  Interfaces 


Shared  decision-making  interfaces  will  require  advanced  concepts  to  adhere  to 
combat  constraints  especially  for  small-unit  ground  forces  such  as  the  ASM 
paradigm (Chen and Barnes 2015). A combination of multisensory interfaces
improves the Soldier’s ability to adjust to multiple situations such as the necessity
for radio silence, day and night missions, and eyes-forward and hands-free operation. Hill
et  al.  (2015)  demonstrated  the  utility  of  controlling  small  robots  using  multiple 
control  devices  (stylus,  voice,  and  glove)  during  an  Army  field  experiment  in  2014 
(Fig.  7).  The  diversity  of  both  input  and  display  devices  enabled  communications 
with robots under a variety of field conditions.

Fig. 7 Multiple robotic control devices used in field exercise (Hill et al. 2015)

Elliott  and  her  ARL  colleagues  (Elliott  et  al.  2010,  2015)  have  investigated  voice, 
gesture,  and  tactile  communications  to  develop  naturalistic  interfaces  that  enhance 
the  ability  of  the  operator  to  control  robots.  The  advantages  of  these  interfaces  are 
that  Soldiers  are  not  burdened  by  having  to  look  down  at  a  display,  hands  are  freed 
to  carry  weapons,  and  Soldiers  can  be  signaled  covertly  concerning  both  their  own 
and the robot’s location. In a field experiment at Fort Benning, Georgia, Soldiers
using a tactile vest were able to respond more rapidly and accurately than with
traditional signaling methods during a reconnaissance mission (Elliott et al. 2016).
For  night  missions  in  particular,  tactile  vests  proved  an  important  navigation  aid 
(Pomranky-Hartnett  et  al.  2015).  These  findings  reinforce  the  corpus  of  tactile 
research  indicating  the  robustness  of  tactile  communication  under  a  variety  of 
conditions  (Van  Erp  2007;  Jones  and  Sarter  2008;  Elliott  et  al.  2009;  Barnes  et  al. 
2014;  Barber  et  al.  2015;  Elliott  et  al.  2015). 

The  technology  for  gesture  control  systems  is  undergoing  rapid  development  with 
organizations  exploring  multiple  options  because  of  the  perceived  commercial 
benefit of these devices (Elliott et al. 2016). The 2 principal approaches are
camera-based systems and (wireless) instrumented gloves. Both approaches assume the
operator is able to signal unambiguously, and both types depend on algorithms embedded in
the asset (e.g., Hidden Markov Models) to disambiguate signals. The results related
to  gesture  control  are  mixed.  For  example,  Soldiers  in  Elliott  et  al.’s  (2015) 
experiment  rated  the  utility  of  instrumented  gloves  highly  but  were  more  accurate 
using  the  tablet  display  for  robot  control.  Currently,  various  problems  with  gesture 
control  have  been  noted.  The  systems  are  too  bulky  to  be  practical,  signaling  by  the 
operator  can  be  difficult,  and  there  are  security  issues  related  to  wireless 
transmissions (Elliott et al. 2016). Gesture control’s most likely use will be in
conjunction with voice, visual, and tactile interfaces so that the advantages of each
modality can offset the difficulties of the others in particular situations. The caveat is that the
packaging  of  multimodal  solutions  will  have  to  become  feasible  (durable,  wearable, 
and  lightweight)  for  military  applications  (Hill  et  al.  2015). 
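
As a rough illustration of how a Hidden Markov Model can disambiguate gesture signals, the sketch below scores an observed sequence of quantized hand motions against one small model per gesture using the forward algorithm and picks the most likely gesture. The two gestures, their states, and all probabilities are invented for the example and do not describe any system in the cited studies.

```python
from typing import Dict, List

# Hypothetical gesture models over the symbols "left", "right", and "up".
MODELS: Dict[str, dict] = {
    "wave": {
        "start": {"L": 0.5, "R": 0.5},
        "trans": {"L": {"L": 0.1, "R": 0.9}, "R": {"L": 0.9, "R": 0.1}},
        "emit": {
            "L": {"left": 0.8, "right": 0.1, "up": 0.1},
            "R": {"left": 0.1, "right": 0.8, "up": 0.1},
        },
    },
    "halt": {
        "start": {"raise": 1.0, "hold": 0.0},
        "trans": {"raise": {"raise": 0.2, "hold": 0.8}, "hold": {"raise": 0.0, "hold": 1.0}},
        "emit": {
            "raise": {"left": 0.1, "right": 0.1, "up": 0.8},
            "hold": {"left": 0.2, "right": 0.2, "up": 0.6},
        },
    },
}


def sequence_likelihood(model: dict, observations: List[str]) -> float:
    """Forward algorithm: probability of the observations under one model."""
    states = list(model["start"])
    alpha = {s: model["start"][s] * model["emit"][s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: sum(alpha[p] * model["trans"][p][s] for p in states) * model["emit"][s][symbol]
            for s in states
        }
    return sum(alpha.values())


def classify(observations: List[str]) -> str:
    """Return the gesture whose model assigns the sequence the highest likelihood."""
    return max(MODELS, key=lambda name: sequence_likelihood(MODELS[name], observations))
```

An alternating left/right sequence is classified as the hypothetical "wave" gesture and a run of "up" symbols as "halt". Longer sequences would normally be scored in log space to avoid numerical underflow.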

The  advantage  to  multisensory  interfaces  is  that  they  offer  both  practical  and 
performance  enhancing  methods  for  communication  between  IAs  (e.g.,  robots)  and 
humans  during  complex  military  operations  (Hancock  et  al.  2011;  Barnes  et  al. 
2014).  Nontraditional  interfaces  are  being  investigated  to  enable  human  control  of 
new  technologies  such  as  robotic  swarms  that  are  holistic  configurations  of  many 
agents  that  reconfigure  autonomously.  In  a  Brigham  Young  University  study  (Alder 
et  al.  2015),  a  swarm  is  being  developed  that  is  controlled  by  a  haptic  interface  that 
uses  pressure  and  directional  cues  to  “drag”  the  swarm  around  objects  representing 
buildings.  The  user  interface  is  designed  to  give  the  operator  intuitive  feedback 
during  the  drag  operation  that  does  not  require  “heads-up”  visual  cues.  Figure  8 
shows  the  haptic  interface  control  dynamics  necessary  to  move  the  swarm  around 
a  simulated  structure  (Alder  et  al.  2015). 


Fig.  8  Haptic  forces  to  move  a  robotic  swarm  in  the  desired  configuration  around  a  nominal 
structure  (Alder  et  al.  2015) 


6.  Summary  and  Discussion 

We  reviewed  human-agent  research  focusing  on  shared  decision-making  in  which 
humans  supervise  multiple  IAs  that  have  varying  degrees  of  autonomy  (Chen  et  al. 
2011;  Draper  2014).  The  scope  of  the  report  encompasses  the  following  5  areas  of 
mixed initiative decision-making and their enabling technologies:

1. Control issues related to using IAs as intermediate supervisors to monitor
n-systems  that,  in  turn,  are  supervised  by  a  human  operator  (Chen  et  al. 
2011;  Chen  and  Barnes  2012a;  Chen  and  Barnes  2014). 

2.  Transparency  and  trust  issues  related  to  the  operator’s  understanding  of  the 
IA’s  intent,  reasoning,  and  projected  outcomes  (Chen  et  al.  2014;  Mercado 
et  al.  2016;  Selkowitz  et  al.  2016  in  press;  Stowers  et  al.  2016). 

3.  Shared  knowledge  structures  including  computational  cognitive 
architectures  that  enable  effective  collaboration  (Kelley  2014). 

4.  Language  processing  software  to  foster  2-way  communications  between 
agents  and  humans  (Giammanco  et  al.  2015;  Wang  et  al.  2016). 

5.  Specialized  interfaces  to  expedite  control  of  embodied  agents  using  gesture, 
voice, and haptic controllers (Pettit et al. 2013; Barnes et al. 2014; Alder
et  al.  2015;  Elliott  et  al.  2015,  2016). 

Progress  in  integrating  these  components  into  systems  that  approximate  human 
teams is encouraging, but the state of the art is still very much in the research phase
(Chen and Barnes 2014). Needless to say, there are important issues of human-agent
collaboration that go beyond the research covered in this report. We briefly
review  some  of  these  areas  to  discuss  pertinent  issues  as  grist  for  future  research. 

Eventually,  autonomous  robots  will  be  used  for  multiple  functions  including 
security,  medical  uses,  maintenance,  and  transportation  as  well  as  combat  roles  that 
will  require  IAs  to  be  part  of  the  Soldier’s  daily  experience.  Especially  for 
embodied  agents  such  as  ground  robots,  IAs  must  fit  into  a  social  network  that 
requires  them  to  interact  within  human  moral  and  emotional  expectations  (Jones 
and  Schmidlin  2011;  Fiore  et  al.  2013).  Humans  and  robots  will  have  to  co-exist 
and  respect  each  other’s  space,  which  will  require  developing  an  etiquette  to  guide 
their  interaction  (Parasuraman  and  Miller  2004).  IAs  that  collaborate  with  operators 
on  a  personal  level  should  be  designed  to  express  and  respond  to  nonlinguistic  cues. 
Breazeal  (2003,  2009)  discussed  the  benefits  and  challenges  of  designing  an 
anthropomorphic  robot  (“Kismet”)  whose  facial  expression  is  able  to  convey 
emotional  cues  to  facilitate  a  more  natural  relationship  between  the  robot  and  its 
human  clients.  Poorly  designed  IAs  can  have  negative  effects  as  well,  causing 
distrust  and  reluctance  to  share  the  same  living  space  (Breazeal  2009). 

Arkin  and  Ulam  (2009)  believe  that  ethical  considerations  need  to  be  encoded  into 
robotic  architectures  to  ensure  that  autonomy  does  not  lead  to  dangerous  behaviors. 
During  military  operations,  autonomous  decisions  need  to  be  made  rapidly  and 
inflexible  rules  could  be  disastrous.  However,  as  long  as  software  rules  are 
transparent, having ethical brakes embedded into IA architectures will give the
human supervisor more oversight during dangerous situations (Barnes and Evans
2010). Scheutz (2016 in press) argues that designing robots that avoid harming
humans (i.e., “implicitly moral” robots) is not sufficient. Robots need to be
“explicitly moral”, using the same reasoning process humans use when
encountering a difficult moral dilemma. Obviously, this is not always easy for
either  a  human  or  an  IA,  but  a  robot  can  signal  its  operator  that  the  actions  it  is 
being  directed  to  perform  have  ethical  consequences.  For  example,  Briggs  and 
Scheutz  (2012)  investigated  human  responses  to  a  robot  showing  discomfort 
concerning  destroying  simulated  buildings.  The  results  indicate  that  robots  can 
influence  their  human  supervisors’  ethical  decisions  through  nonexplicit  behaviors. 
Whether  the  simulated  advisories  would  be  effective  during  real-world  operations 
is  unclear.  However,  an  IA  advising  against  executing  an  unethical  or  dangerous 
option  would  remind  the  operator  of  the  consequences  of  the  proposed  actions 
(possibly via displaying projected outcomes as suggested in Level 3 of the SAT
model [Chen et al. 2014]). Advisories will be particularly effective if it is clear to the
operator that the advisories are based on military doctrine (Endsley 2015b). The
trade-offs between speed and control, lethality and safety, and ethics and expediency
are important issues that will permeate future IA research (Economist 2016).

An  important  trend  in  agent  technology  is  the  greater  use  of  ML  methods.  ML  is 
not  a  single  approach  but  rather  subsumes  many  algorithmic  and  statistical  methods 
such  as  evolutionary  algorithms,  Bayesian  approaches,  adaptive  control  theory, 
neural nets, and the like. Nilsson (1998), in his now classic textbook, defined ML
as “... a machine learns whenever it changes its structure, program, or data (based
on its inputs or in response to external information) in such a manner that its
expected future performance improves”. There are 2 principal issues related to ML:
1) the underlying algorithmic approach is often opaque and 2) the reasoning for
choosing a COA will change as more information is accumulated. The underlying
logic depends on the technique used. For example, evolutionary algorithms use a
fitness function (e.g., number of casualties) to choose a solution at each iteration
(Suantak  et  al.  2001).  In  this  case,  the  efficacy  of  the  ML  solution  can  be  described 
in  terms  of  its  expected  outcomes  (in  relation  to  its  fitness  function)  as  the  COA 
changes  over  time.  Because  of  the  dynamics  of  military  environments,  ML  will  be 
an  essential  tool  for  IA  technology  (Draper  2014).  Both  IAs  and  their  human 
counterparts  must  adapt  to  military  environments  that  are  in  a  state  of  continual 
flux.  However,  it  is  difficult  to  believe  that  humans  will  have  trust  in  a  system  that 
changes  its  preferred  solutions  unless  the  logic  and  expected  outcomes  driving  the 
changes  are  transparent  (Chen  et  al.  2014). 
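
The role of the fitness function in an evolutionary algorithm, and the idea of describing a changing COA by its expected outcome, can be sketched as follows. The COA encoding, the risk table, and the selection and mutation scheme are assumptions for illustration and are not the method of Suantak et al. A COA is a list of route choices, fitness is the negative of a hypothetical expected-casualty score, and the best fitness in each generation is logged so that the change in the preferred COA can be explained in terms of outcomes.

```python
import random
from typing import List, Tuple

# Hypothetical expected casualties for route option 0 or 1 at each of 6 waypoints.
RISK = [(3, 1), (2, 5), (4, 4), (1, 6), (5, 2), (2, 3)]


def fitness(coa: List[int]) -> int:
    """Higher is better: the negative of total expected casualties for the COA."""
    return -sum(RISK[i][choice] for i, choice in enumerate(coa))


def evolve(generations: int = 30, pop_size: int = 20, seed: int = 0) -> Tuple[List[int], List[int]]:
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in RISK] for _ in range(pop_size)]
    history = []  # best fitness per generation, usable as an explanation trace
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        for parent in survivors:
            child = parent[:]
            i = rng.randrange(len(child))
            child[i] = 1 - child[i]  # mutate one route choice
            children.append(child)
        population = survivors + children
    return max(population, key=fitness), history


best_coa, history = evolve()
```

Because the fitter half of each generation is carried over unchanged, the logged best fitness never gets worse from one generation to the next; presenting that trace alongside the current COA is one simple way to expose why the preferred solution changed.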

7. Conclusions


We  conclude  that  shared  decision-making  between  humans  and  IAs  shows  potential 
as  an  effective  means  of  addressing  the  complexities  of  modern  warfare.  However, 
more  research  is  needed  before  this  potential  is  realized.  The  human’s  and  the 
agent’s  perception  of  the  world  must  be  aligned  to  create  the  synergy  necessary  to 
take full advantage of their respective strengths and compensate for their respective
limitations. This requires transparency of both the human’s and the agent’s intent, logic, and projected
outcomes.  The  agent’s  software  architecture  must  support  both  bi-directional 
transparency  and  human-agent  communication.  Language  processing  from  simple 
commands  to  complex  inferencing  is  maturing  rapidly,  making  human-agent  teams 
that  can  communicate  with  each  other  feasible  in  the  near  future.  Future  research 
efforts  should  address  the  effects  of  emotions  on  human-agent  team  building, 
ethical constraints of autonomy, and the promise and perils of machine learning.

ARL       US Army Research Laboratory
ARPI      Autonomy Research Pilot Initiative
ASM       Autonomous Squad Member
CE        Controlled English
COA       course of action
DOD       US Department of Defense
DSB       Defense Science Board
IA        intelligent (nonhuman) agent
IDF       Israeli Defense Force
IMPACT    Intelligent Multi-UxV Planner with Adaptive Collaborative/Control Technologies
L         level
ML        machine learning
NLP       natural language processing
SA        situation awareness
SAT       SA-based Agent Transparency
U         uncertainty
UAV       unmanned aerial vehicle
UGV       unmanned ground vehicle
UV        unmanned vehicle